
16 research outputs found

Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence

Author: Cheung Joseph
Estivill Xavier
Khaja Razi
Lau Ken
MacDonald Jeffrey R
Scherer Stephen W
Tsui Lap-Chee
Publication venue: BioMed Central
Publication date: 01/01/2003

BACKGROUND: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. RESULTS: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. CONCLUSION: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.

Springer - Publisher Connector

PubMed Central

UPF Digital Repository

HKU Scholars Hub

On the association between chromosomal rearrangements and genic evolution in humans and chimpanzees

Author: Armengol Lluís
Bertranpetit Jaume
Gazave Elodie
Khaja Razi
Lopez-Bigas Núria
Marques-Bonet Tomàs
Navarro Arcadi
Rocchi Mariano
Sànchez-Ruiz Jesús
Publication venue: BioMed Central
Publication date: 01/01/2007

Analysis of the genes located in rearranged human and chimpanzee chromosomes identified lower divergence than for those in colinear chromosomes.

Crossref

Springer - Publisher Connector

PubMed Central

Archivio istituzionale della ricerca - Università di Bari

UPF Digital Repository

Discovery of Human Inversion Polymorphisms by Comparative Analysis of Human and Chimpanzee DNA Sequence Assemblies

Author: Andrew R Carson
Barbara Trask
Girish Rao
Jeffrey R MacDonald
Lars Feuk
Martin Li
Razi Khaja
Stephen W Scherer
Terence Tang
The International HapMap Consortium
Publication venue: Public Library of Science
Publication date: 01/01/2005

With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 megabases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 megabases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 megabases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.

Crossref

Directory of Open Access Journals

PubMed Central

Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence

Author: Cheung Joseph
Estivill Xavier
Khaja Razi
Lau Ken
MacDonald Jeffrey R
Scherer Stephen W
Tsui Lap-Chee
Publication venue: University of Toronto
Publication date: 27/03/2018

Background: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. Results: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. Conclusion: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.

University of Toronto Research Repository

Murine segmental duplications are hot spots for chromosome and gene evolution

Author: Armengol Lluís
Cheung Joseph
Estivill Xavier
González Juan R.
Khaja Razi
Marquès-Bonet Tomàs
Navarro Arcadi
Scherer Stephen W.
Publication venue: Elsevier Inc.
Publication date: 31/12/2005

Mouse and rat genomic sequences permit us to obtain a global view of evolutionary rearrangements that have occurred between the two species and to define hallmarks that might underlie these events. We present a comparative study of the sequence assemblies of mouse and rat genomes and report an enrichment of rodent-specific segmental duplications in regions where synteny is not preserved. We show that segmental duplications present higher rates of molecular evolution and that genes in rearranged regions have evolved faster than those located elsewhere. Previous studies have shown that synteny breakpoints between the mouse and the human genomes are enriched in human segmental duplications, suggesting a causative connection between such structures and evolutionary rearrangements. Our work provides further evidence to support the role of segmental duplications in chromosomal rearrangements in the evolution of the architecture of mammalian chromosomes and in the speciation processes that separate the mouse and the rat.

Elsevier - Publisher Connector

Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence

Author: Cheung Joseph
Estivill Xavier, 1955-
Khaja Razi
Lau Ken
MacDonald Jeffrey R
Scherer Stephen W.
Tsui Lap-Chee
Publication venue: Springer Science and Business Media LLC

Background: Previous studies have suggested that recent segmental duplications, which are often involved in chromosome rearrangements underlying genomic disease, account for some 5% of the human genome. We have developed rapid computational heuristics based on BLAST analysis to detect segmental duplications, as well as regions containing potential sequence misassignments in the human genome assemblies. Results: Our analysis of the June 2002 public human genome assembly revealed that 107.4 of 3,043.1 megabases (Mb) (3.53%) of sequence contained segmental duplications, each at least 5 kb in size with at least 90% identity. We have also detected that 38.9 Mb (1.28%) of sequence within this assembly is likely to be involved in sequence misassignment errors. Furthermore, we have identified a significant subset (199,965 of 2,327,473 or 8.6%) of single-nucleotide polymorphisms (SNPs) in the public databases that are not true SNPs but are potential paralogous sequence variants. Conclusion: Using two distinct computational approaches, we have identified most of the sequences in the human genome that have undergone recent segmental duplications. Near-identical segmental duplications present a major challenge to the completion of the human genome sequence. Potential sequence misassignments detected in this study would require additional efforts to resolve.

RECERCAT

Recent segmental and gene duplications in the mouse genome

Author: Cheung Joseph
Heng Henry H
Khaja Razi
Koop Ben F
MacDonald Jeffrey R
Scherer Stephen W
Wilson Michael D
Zhang Junjun
Publication date: 27/03/2018

Background: The high quality of the mouse genome draft sequence and its associated annotations are an invaluable biological resource. Identifying recent duplications in the mouse genome, especially in regions containing genes, may highlight important events in recent murine evolution. In addition, detecting recent sequence duplications can reveal potentially problematic regions of the genome assembly. We use BLAST-based computational heuristics to identify large (≥ 5 kb) and recent (≥ 90% sequence identity) segmental duplications in the mouse genome sequence. Here we present a database of recently duplicated regions of the mouse genome found in the mouse genome sequencing consortium (MGSC) February 2002 and February 2003 assemblies. Results: We determined that 33.6 Mb of 2,695 Mb (1.2%) of sequence from the February 2003 mouse genome sequence assembly is involved in recent segmental duplications, which is less than that observed in the human genome (around 3.5-5%). From this dataset, 8.9 Mb (26%) of the duplication content consisted of 'unmapped' chromosome sequence. Moreover, we suspect that an additional 18.5 Mb of sequence is involved in duplication artifacts arising from sequence misassignment errors in this genome assembly. By searching for genes that are located within these regions, we identified 675 genes that mapped to duplicated regions of the mouse genome. Sixteen of these genes appear to have been duplicated independently in the human genome. From our dataset we further characterized a 42 kb recent segmental duplication of Mater, a maternal-effect gene essential for embryogenesis in mice. Conclusion: Our results provide an initial analysis of the recently duplicated sequence and gene content of the mouse genome. Many of these duplicated loci, as well as regions identified to be involved in potential sequence misassignment errors, will require further mapping and sequencing to achieve accuracy. A Genome Browser database was set up to display the identified duplication content presented in this work. This data will also be relevant to the growing number of investigators who use the draft genome sequence for experimental design and analysis.

University of Toronto Research Repository
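
The segmental-duplication abstracts in this listing describe BLAST-based heuristics that keep alignments of at least 5 kb with at least 90% sequence identity. The sketch below illustrates only that thresholding step. It assumes the alignments are available as 12-column tabular BLAST output (the standard "outfmt 6" column order), and the names Hit, parse_hit, is_candidate and the sample rows are hypothetical, invented for illustration. It is not the authors' pipeline, which the abstracts do not describe in detail.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One alignment from 12-column tabular BLAST output (hypothetical helper)."""
    query: str
    subject: str
    identity: float  # percent identity
    length: int      # alignment length in base pairs
    q_start: int
    q_end: int
    s_start: int
    s_end: int


def parse_hit(line: str) -> Hit:
    """Parse one tab-separated line; columns 5, 6, 11 and 12 are not needed here."""
    f = line.rstrip("\n").split("\t")
    return Hit(f[0], f[1], float(f[2]), int(f[3]),
               int(f[6]), int(f[7]), int(f[8]), int(f[9]))


def is_candidate(hit: Hit, min_len: int = 5000, min_identity: float = 90.0) -> bool:
    """Apply the >= 5 kb and >= 90% identity thresholds quoted in the abstracts.

    Trivial self-alignments (same sequence, same coordinates) are skipped;
    that exclusion is an assumption of this sketch, not a stated step.
    """
    self_hit = (hit.query == hit.subject
                and (hit.q_start, hit.q_end) == (hit.s_start, hit.s_end))
    return not self_hit and hit.length >= min_len and hit.identity >= min_identity


# Made-up example rows, not data from any of the listed papers.
rows = [
    "chr7\tchr7\t100.0\t8000\t0\t0\t1\t8000\t1\t8000\t0.0\t14000",         # self-hit
    "chr7\tchr16\t94.5\t6200\t300\t10\t1000\t7199\t500\t6699\t0.0\t9000",  # passes
    "chr7\tchr2\t97.0\t1200\t30\t2\t100\t1299\t400\t1599\t0.0\t2000",      # too short
    "chr7\tchr3\t85.0\t9000\t1300\t40\t100\t9099\t200\t9199\t0.0\t9000",   # identity too low
]

candidates = [h for h in map(parse_hit, rows) if is_candidate(h)]
print([h.subject for h in candidates])  # prints ['chr16']
```

A real analysis would also need to merge overlapping hits and mask common repeats before counting duplicated bases; those steps are outside this sketch.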